Abstract:

Data aggregation is an important data processing technique in M2M communications:
in response to a specific application requirement, only the selected/processed data
are transmitted to the application domain. In this research proposal, we aim to
investigate secure data aggregation with fault tolerance in M2M communications.
Unlike previous research on secure data aggregation, information privacy and data
integrity will be integrated into data aggregation simultaneously, and fault tolerance
will also be studied. Specifically, the novelty of this research project lies in the
following aspects: i) developing new data aggregation schemes that simultaneously
achieve information privacy and data integrity in M2M communications; ii) developing
new privacy-preserving data aggregation schemes with fault tolerance for M2M
communications. This proposal will contribute to secure communications and
information exchange between sets of nodes, and the lessons learned will better
prepare AOARD to establish a strategy for new information security and transmission
challenges in future military M2M communications.


Introduction: 

M2M communication is characterized by a large number of intelligent devices sharing
information and making collaborative decisions without direct human intervention.
Because it can support a large number of ubiquitous characteristics and achieve better
cost efficiency, M2M communication has quickly become a market-changing force for a
wide variety of real-time monitoring applications, such as traffic surveillance, smart
metering, environmental monitoring, industrial automation and military scenarios
[1][2]. Although M2M applications vary, the basic M2M communication infrastructure is
quite similar and usually consists of three parts: the M2M domain, the network domain,
and the application domain [3], as shown in Fig. 1. In this infrastructure, information
is generated by sensors in the M2M domain, transmitted through a wired/wireless
network in the network domain, and finally passed through a gateway to the application
domain, where it can be reviewed and acted on. Data transmission is therefore a
critical component of the infrastructure. However, because of the huge volume of data
generated in the M2M domain, transmitting all of it directly is infeasible or
cost-inefficient. Therefore, transmission of selected/processed data is expected in
M2M communications.


[Figure: sensors in the M2M domain collect data; the data travel over a wired/wireless
network and through a gateway to the application domain, where the data are assessed
and the available information is acted on.]

Fig. 1. Basic M2M communication infrastructure


Data aggregation [4] is an important data processing technique: in response to a
specific application requirement, only the selected/processed data, e.g., count, sum,
max, min, or average values, are transmitted to the application domain. It can
therefore greatly reduce the transmission cost while still meeting the application
requirement. Because of this efficiency, data aggregation has received considerable
attention over the past years, and many data aggregation schemes have been proposed
[4][5]. However, many previously reported schemes cannot be applied directly to M2M
communications, partly because they do not take the unique characteristics of M2M
communications into account. Since sensors in M2M communications are usually low
cost, small in size, and often deployed in unattended environments, they are
vulnerable to malicious attacks and sometimes malfunction [3]. Therefore, to make data
aggregation workable in M2M communications, the requirements of security and fault
tolerance should be reinforced. We note that, although some secure data aggregation
schemes [6][7][8] have been proposed for sensor networks to resist pollution attacks,
they sometimes do not work well because they lack the fault tolerance property.
Therefore, secure data aggregation with fault tolerance still needs further study in
M2M communications.
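
The kind of reduction described above can be illustrated with a minimal sketch. It is not taken from any scheme in this report: the function name `aggregate` and the sample readings are hypothetical, and no encryption or fault handling is shown. It only makes concrete that a gateway may forward a handful of statistics instead of every raw reading.

```python
from statistics import mean


def aggregate(readings):
    """Reduce raw sensor readings to the summary statistics named in the text."""
    if not readings:
        raise ValueError("no readings to aggregate")
    return {
        "count": len(readings),
        "sum": sum(readings),
        "max": max(readings),
        "min": min(readings),
        "average": mean(readings),
    }


# Hypothetical temperature samples collected in the M2M domain.
readings = [21.5, 22.0, 19.5, 23.0]
summary = aggregate(readings)
print(summary)
```

Only the five values in `summary` would need to reach the application domain, however many sensors contributed.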

Experiment, Results and Discussion:


(1)  PDA:  A  Privacy-Preserving  Dual-Functional  Aggregation  Scheme  for  Smart  Grid 
Communications 

Abstract 

Privacy-preserving aggregation for smart grid communications, which meets the
requirement of periodically collecting users' electricity consumption while preserving
the privacy of each individual user, has been extensively studied in recent years.
However, most existing privacy-preserving aggregation schemes focus only on summation
aggregation. In this paper, based on lattice cryptography, we propose a novel
privacy-preserving dual-functional aggregation scheme (PDA) for smart grid
communications. With the proposed PDA scheme, each user reports only one data item,
from which multiple statistics of all users, namely the mean and the variance, can be
computed by the data & control center in the smart grid, while the privacy of each
individual user is still protected. Detailed security analyses demonstrate that the
proposed PDA scheme is secure and robust. In addition, extensive performance
evaluations show that the proposed PDA scheme is efficient in terms of computational
and communication overhead.


The  conceptual  smart  grid  system  architecture 


System  model  under  consideration 


Major  Features  and  Contributions 

• PDA uses a homomorphic encryption scheme to encrypt users' data, so that users'
privacy is protected from eavesdropping under the defined attack model. PDA supports
both additive and multiplicative aggregation, which enables the data & control center
(DCC) to obtain both the mean and the variance of the users' data from only one
report sent by each user.

• Additional techniques, including multi-bit ring-LWE encryption, encoding of integers
as polynomials, and super-increasing sequence filling, are integrated into the
optimized PDA to further reduce the computational cost, make full use of bandwidth,
and lower the communication overhead.
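
Why a single report per user can yield both statistics may be easier to see on plaintext. The sketch below is an assumption about the underlying arithmetic, not PDA's actual construction: it supposes that additive aggregation gives the DCC the sum of the readings and that one multiplication per report followed by addition gives the sum of their squares. The function name and sample values are hypothetical, and no encryption is performed.

```python
def mean_and_variance(sum_x, sum_x_squared, n):
    """Recover mean and population variance from two aggregates."""
    mean = sum_x / n
    variance = sum_x_squared / n - mean ** 2
    return mean, variance


# Hypothetical per-user readings; each user contributes one value only.
usage = [3, 5, 4, 8]
sum_x = sum(usage)                   # what additive aggregation would deliver
sum_x2 = sum(x * x for x in usage)   # what multiply-then-add would deliver
mean, variance = mean_and_variance(sum_x, sum_x2, len(usage))
print(mean, variance)
```

The identity used is Var(x) = E[x^2] - (E[x])^2, so the two aggregates are sufficient and no individual reading has to be revealed.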


Performance  Evaluation 

Our  proposed  PDA  scheme  outperforms  the  basic  Paillier-based  aggregation  scheme 
[1,2,4]  in  terms  of  computation  cost  and  communication  overhead. 


A. Computation Cost

Operation             | Paillier encryption scheme | Our scheme
Encryption            | O(n^2 log n log log n)     | O(n log2 n)
Homomorphic operation | O(n log n log log n)       | O(n log n)
Decryption            | O(n^2 log n)               | O(n log2 n)

B.  Communication  Cost 


The optional parameter settings of our scheme are as follows.

n = k | l | n' = (2^l - 1) * 20 | n log q / l
192   | 3 | 140                 | 96 log N + 1454
256   | 3 | 140                 | 128 log N + 2924
320   | 4 | 300                 | 150 log N + 3288

Setting n = 256 for our scheme provides security equivalent to AES-128, which is
sufficient for ordinary use. For the Paillier-based scheme to provide the same
security, the RSA modulus N should be 2048 bits long. Under these parameter settings,
the communication overhead of our scheme and of the basic Paillier-based scheme
compare as follows.


[Figure: communication overhead versus the number of users N; one curve is labeled
128*log2(x)+2924 and the other is the constant y2 = 4096.]

Communication overhead comparison (bits/data)


(2) A New Differentially Private Data Aggregation with Fault Tolerance for Smart Grid
Communications

Abstract 

Privacy-preserving data aggregation has been widely studied to meet the requirement of
timely monitoring of users' measurements while protecting individual privacy in smart
grid communications. In this paper, a new secure data aggregation scheme, named
DPAFT, is proposed, which achieves differential privacy and fault tolerance
simultaneously. Specifically, inspired by the Diffie-Hellman key exchange protocol, an
artful constraint relation is constructed that differs from all existing similar
works. Thanks to this novel constraint, DPAFT supports fault tolerance for
malfunctioning smart meters efficiently and flexibly. DPAFT is also enhanced to resist
differential attacks, to which most existing data aggregation schemes are vulnerable.
Moreover, by improving the basic Boneh-Goh-Nissim cryptosystem to make it more
applicable to practical scenarios, DPAFT can resist much stronger adversaries, i.e.,
users' privacy is protected in the honest-but-curious model. In addition, extensive
performance evaluations illustrate that DPAFT outperforms a state-of-the-art data
aggregation scheme in terms of storage cost, computation complexity, utility of
differential privacy, robustness of fault tolerance, and efficiency of user addition
and removal.


Residential  Users 

System  model  under  consideration 


Major  Features  and  Contributions 

• Inspired by the Diffie-Hellman key exchange protocol, we put forward a novel
solution for fault tolerance in smart metering. Unlike all existing similar works,
which depend on the restrictive relation $\sum_{i=0}^{n} s_i = 0$, we construct an
artful constraint relation $s_0 \cdot \sum_{i=1}^{n} s_i = 1$, where $s_0$ and $s_i$,
for $i = 1, 2, \ldots, n$, are the private keys of the control center (CC) and of each
residential user, respectively.

• DPAFT provides differential privacy through a distributed noise generation
procedure, in which Laplacian noise is added in a distributed manner. Compared with
the state-of-the-art differentially private smart grid aggregation protocol [3], our
protocol is more efficient because it eliminates the heavy communication, computation,
and storage overhead of future ciphertexts, while still providing high utility (i.e.,
low error).

• By improving the basic Boneh-Goh-Nissim cryptosystem to make it more applicable to
practical scenarios, DPAFT can resist a much stronger adversary and is highly
efficient. Specifically, by hiding the private key p of the basic Boneh-Goh-Nissim
cryptosystem from the CC and introducing the blinding factor t for the GW and the
secret key r for the CC, respectively, the users' electricity usage privacy is
protected in the honest-but-curious model.
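
The report does not spell out how the Laplacian noise is split among the meters. One common construction, assumed here for illustration and not confirmed as DPAFT's own, relies on the fact that a Laplace variable with scale b has the same distribution as the sum of n differences of independent Gamma(1/n, b) draws. Each meter can then add one small share, and only the aggregate carries full Laplace noise. The names `noise_share` and `aggregated_noise` and all parameter values are hypothetical.

```python
import random


def noise_share(n_users, scale, rng):
    """One meter's share: the difference of two Gamma(1/n, scale) draws."""
    shape = 1.0 / n_users
    return rng.gammavariate(shape, scale) - rng.gammavariate(shape, scale)


def aggregated_noise(n_users, scale, rng):
    """Sum of all shares; distributed as Laplace(0, scale)."""
    return sum(noise_share(n_users, scale, rng) for _ in range(n_users))


rng = random.Random(7)   # fixed seed so the run is repeatable
n_users, scale = 20, 2.0
samples = [aggregated_noise(n_users, scale, rng) for _ in range(3000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(sample_mean, sample_var)  # a Laplace(0, 2) variable has variance 2 * 2**2 = 8
```

A design consequence of this construction is that no single party ever holds the whole noise value, which suits a setting where the gateway is not fully trusted.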


Performance  Evaluation 

Our proposed scheme is compared with the state-of-the-art scheme proposed by
Jongho et al. [3] as follows.


A.  Storage  Cost 


Jongho et al.'s scheme: A huge amount of buffer memory must be configured at the GW to
store the future ciphertexts.

Our proposed scheme: The GW is responsible only for data aggregation and packet
relay. No special storage is required.

B. Computation Complexity

Jongho et al.'s scheme: The shared secret keys between every two users in the k-size
partner group must be generated and assigned secretly. Two parts of ciphertext, i.e.,
the current ciphertext and the future ciphertext, must be calculated and reported.

Our proposed scheme: There is no need to compute and assign shared secret keys among
the users. The additional computation of the future ciphertext is not necessary
either.

C.  Utility  of  Differential  Privacy 


Jongho et al.'s scheme: Additional Laplace noise is added to each smart meter's
future ciphertext to resist the attack of subtracting the current ciphertext from the
future ciphertext, which incurs large errors.

Our proposed scheme: Overcomes the above drawback and therefore offers better utility.

The following figures compare the actual total measurements with the noisy
counterparts produced by the scheme of Jongho et al. and by ours. In each figure, n
and p denote the total number of households and the ratio of malfunctioning smart
meters, respectively.


Comparison  of  noisy  total  consumption  between 
our  scheme  and  Jongho  et  al.'s  scheme 


Define the one-day RMSE (root mean square error), which measures the closeness between
the sequences of actual and noisy sums, as
$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}(\hat{m}_t - m_t)^2}$, where
$\hat{m}_t$ and $m_t$ are the noisy and actual sums at time point t and $T = 1440$ is
the number of time points in one day. Let the one-hour RMSE of Jongho et al.'s
protocol and of ours be $\gamma_1$ and $\gamma_2$, respectively, and let
$\gamma = \gamma_1 / \gamma_2$. The one-hour RMSE and one-day RMSE of our scheme and
Jongho et al.'s scheme are compared as follows.


[Figure: (a) Comparison of one-hour-RMSE; (b) Comparison of one-day-RMSE. Curves:
Jongho et al.'s scheme and our scheme.]

Comparison of one-hour-RMSE and one-day-RMSE
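
The RMSE used in this comparison can be computed as in the following minimal sketch. The function name `rmse` and the four sample values are hypothetical and far shorter than the T = 1440 points of a full day; they are chosen so that the arithmetic is easy to follow by hand.

```python
import math


def rmse(actual, noisy):
    """Root mean square error between actual and noisy aggregate sequences."""
    if len(actual) != len(noisy) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((n - a) ** 2 for a, n in zip(actual, noisy))
    return math.sqrt(squared / len(actual))


actual = [100.0, 102.0, 98.0, 101.0]   # hypothetical true totals
noisy = [101.0, 100.0, 98.0, 103.0]    # hypothetical noisy totals
error = rmse(actual, noisy)
print(error)
```

Here the squared differences are 1, 4, 0 and 4, so the mean is 2.25 and the RMSE is 1.5.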


D.  Robustness  of  Fault  Tolerance 


Jongho et al.'s scheme: Supports fault tolerance only for a period of at most B · T,
where B is the buffer size of the future ciphertexts for each smart meter and T is the
report interval.

Our proposed scheme: Supports robust data aggregation for any rational number of
malfunctioning smart meters with an arbitrarily long fault period.

E.  Efficiency  of  User  Addition  and  Removal 


Jongho et al.'s scheme: Costs O(k × B) and O(k) communication overhead for one user
addition and one user removal, respectively.

Our proposed scheme: Only requires the TA to reassign the key materials for the
changed users (user addition and removal).

(3) DDPFT: Secure Data Aggregation Scheme with Differential Privacy and Fault
Tolerance

Abstract 

A new secure data aggregation scheme, named DDPFT, is proposed to achieve
differential privacy and fault tolerance simultaneously. Specifically, by carefully
introducing auxiliary ciphertexts, a novel distributed solution for fault-tolerant
data aggregation is put forward, which can aggregate the measurements of the
functioning smart meters flexibly and efficiently for any rational number of
malfunctioning smart meters with an arbitrarily long failure period. Furthermore,
DDPFT achieves a good trade-off between accuracy (i.e., low error) and the security of
differential privacy for an arbitrary number of malfunctioning smart meters. Moreover,
by decentralizing the computational overhead and the power of the hub-like gateway,
the security of the proposed scheme is enhanced and its efficiency is improved
significantly. In addition, extensive performance evaluations illustrate that DDPFT
outperforms the state-of-the-art data aggregation schemes in terms of computation
complexity, communication cost, robustness of fault tolerance, and utility of
differential privacy.


System  model  under  consideration 


Major  Features  and  Contributions 

• By carefully introducing auxiliary ciphertexts, we put forward a novel distributed
solution for fault-tolerant data aggregation that is more efficient and robust. Using
the auxiliary ciphertexts, the CC can obtain the aggregation of the functioning smart
meters flexibly and efficiently for any rational number of malfunctioning smart meters
with an arbitrarily long failure period.

• DDPFT provides differential privacy by having the GW add appropriate noise, drawn
from a symmetric geometric distribution, to the aggregated data. The proposed scheme
supports differential privacy and fault tolerance simultaneously and achieves a good
trade-off between accuracy and the security of differential privacy.

• By decentralizing the computational overhead and the power of the hub-like GW,
which usually has limited computation resources and is only semi-trusted, the security
of the proposed scheme is enhanced and its efficiency is improved significantly.
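
The text states only that the noise comes from a symmetric geometric distribution. The sketch below assumes the usual parameterisation, in which Pr[k] is proportional to alpha^(-|k|) with alpha = exp(epsilon / sensitivity), and samples it as the difference of two independent geometric variables. The function names, the seed, and the values of `epsilon` and `sensitivity` are hypothetical and are not taken from DDPFT.

```python
import math
import random


def geometric(p, rng):
    """Number of failures k >= 0 with Pr[k] = (1 - p) * p**k."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.floor(math.log(u) / math.log(p)))


def symmetric_geometric(alpha, rng):
    """Integer noise with Pr[k] proportional to alpha**(-abs(k))."""
    p = 1.0 / alpha
    return geometric(p, rng) - geometric(p, rng)


rng = random.Random(11)            # fixed seed so the run is repeatable
epsilon, sensitivity = 1.0, 1.0    # assumed values for illustration
alpha = math.exp(epsilon / sensitivity)
samples = [symmetric_geometric(alpha, rng) for _ in range(20000)]

mean = sum(samples) / len(samples)
zero_fraction = samples.count(0) / len(samples)
print(mean, zero_fraction)
```

Because the noise is integer valued, it can be added to an integer aggregate without any rounding step, which is one reason this distribution is often preferred over continuous Laplace noise when values are encrypted.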

Performance  Evaluation 

Unlike most existing similar works, our scheme takes differential privacy and fault
tolerance into consideration at the same time. We mainly compare the proposed scheme
with the state-of-the-art data aggregation schemes that support differential privacy
and/or fault tolerance.

A.  Computation  Complexity 

We extend Shi et al.'s privacy-preserving aggregation protocol [4] to support fault
tolerance. Our scheme outperforms the scheme of [4] in computation complexity and
supports fault tolerance with only slightly more computational overhead.

B.  Communication  Cost 


Distribution  A:  Approved  for  public  release;  distribution  is  unlimited. 


[Figure: plots against user number and time point for DDPFT and for Shi et al.'s
scheme]

Individual communication overhead

[Figure: plots against user number and time point for DDPFT and for Shi et al.'s
scheme]

Overall communication overhead

C.  Utility  of  Differential  Privacy 

The proposed scheme provides higher utility (i.e., lower error) in terms of
differential privacy than the state-of-the-art data aggregation scheme of [3].


(a) n = 2000, p = 0.15 (b) n = 2000, p = 0.25

Comparison  of  noisy  total  consumption  between 
our  scheme  and  Jongho  et  al.'s  scheme 


(4) A Novel Privacy-Preserving Set Aggregation Scheme for Smart Grid
Communications

Abstract 

In this paper, we propose a novel privacy-preserving set aggregation scheme for smart
grid communications. The proposed scheme is characterized by employing a group G of
composite order n = pq to achieve two-subset aggregation from a single aggregated
data item. With the proposed set aggregation scheme, the control center in the smart
grid can obtain more fine-grained data aggregation results for better monitoring and
control of the smart grid. Detailed security analysis shows that the proposed scheme
achieves the privacy-preserving property, with a formal proof in the random oracle
model. In addition, extensive experiments are conducted, and the results demonstrate
that the proposed scheme is also efficient, with low computational cost and
communication overhead.


Control  Center  (CC)  Gateway  (GW) 


Major  Features  and  Contributions 

• Using a group of composite order, we propose a novel privacy-preserving set
aggregation scheme. Given a threshold on the electricity consumption data, users are
divided into two subsets; the proposed scheme then uses one single aggregated data
item to obtain the sum of the electricity consumption data in each subset and the
corresponding subset size in a privacy-preserving way, which supports more accurate
data analytics for controlling and monitoring the smart grid.

• With a formal security proof technique, we show that the proposed scheme preserves
the data privacy of each individual user.

• We implement the proposed scheme in Java and run extensive experiments to validate
its efficiency in terms of low computational cost and communication overhead, and we
discuss the trade-off between utility and the differential privacy level.
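
What the control center learns from a two-subset aggregation can be illustrated on plaintext. The sketch below shows only the intended result, a sum and a size for each subset, and none of the cryptography. The function name, the sample readings, and the choice to place values equal to the threshold in the upper subset are assumptions made for this example.

```python
def two_subset_aggregate(readings, threshold):
    """Return (sum, size) for readings below and at-or-above the threshold."""
    below = [r for r in readings if r < threshold]
    above = [r for r in readings if r >= threshold]
    return {
        "below": (sum(below), len(below)),
        "above": (sum(above), len(above)),
    }


readings = [2, 7, 4, 9, 5]   # hypothetical consumption values
result = two_subset_aggregate(readings, threshold=5)
print(result)
```

From these four numbers the control center could, for example, compute the average consumption inside each subset, which a single overall sum would not allow.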

Performance  Evaluation 

We evaluate the proposed privacy-preserving set aggregation scheme in terms of
computational cost and communication overhead. Specifically, we implement the scheme
in Java (JDK 1.8) and run our experiments on a laptop with a 3.1 GHz processor, 8 GB
RAM, and the Windows 7 platform. The detailed parameter settings are shown as follows.

Parameter | Value
κ         | κ = 512
G         | G is a subgroup of Z*_P of order n = pq, where P = 2pq + 1 is a large prime, and p, q are also two primes with |p| = |q| = κ
N_max     | N_max = 500
N         | N = 50, 100, 150, 200, 250, 300, 350, 400, 450, 500
Δ         | Δ = 10
th        | the threshold th is randomly chosen from [1, Δ]

Parameter settings


A.  Computational  Cost 

Regardless of which of the two subsets a user belongs to, encryption at the user side
takes only 3.46 ms on average, which is extremely efficient. The following figure
shows how the computational costs of aggregation at the GW and decryption at the CC
vary as the number of users N increases from 50 to 500 in increments of 50. From the
figure, we can see that both operations are efficient, and that the number of users N
has little effect on aggregation and decryption once a hash table used for look-ups in
decryption has been established in advance.


[Figure: (a) Aggregation at GW; (b) Decryption at CC]

Computational costs of aggregation and decryption varying with N
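
The report says only that a hash table for look-ups in decryption is built in advance. A common reason for such a table, assumed here, is that the aggregate ends up in the exponent of a group element, so the CC must solve a small discrete logarithm. The sketch below uses a toy prime modulus and generator that are far too small for real use; `build_lookup` and every constant are hypothetical.

```python
def build_lookup(g, modulus, max_sum):
    """Precompute g**m mod modulus for every possible aggregate m."""
    table = {}
    value = 1
    for m in range(max_sum + 1):
        table[value] = m
        value = (value * g) % modulus
    return table


P = 1019        # toy prime, 1019 = 2 * 509 + 1; illustration only
g = 2
max_sum = 500   # largest aggregate the table has to cover

table = build_lookup(g, P, max_sum)   # done once, before any report arrives

aggregate = 321                       # hypothetical plaintext sum
encoded = pow(g, aggregate, P)        # what decryption would leave in the exponent
recovered = table[encoded]            # constant-time recovery
print(recovered)
```

Once the table exists, recovery is a single dictionary look-up regardless of how many users contributed, which is consistent with the observation that N has little effect on decryption time.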

B.  Communication  Cost 

When |p| = |q| = 512, the length of P = 2pq + 1 is 1025 bits. Thus, any ciphertext
(including q and C) in the subgroup G of Z*_P is at most 1025 bits long.
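
The bit-length bound can be checked with exact integer arithmetic. The sketch below ignores primality and simply evaluates 2pq + 1 at the smallest and largest 512-bit values of p and q; the function name `bit_length_bounds` is hypothetical.

```python
def bit_length_bounds(k):
    """Bit lengths of 2*p*q + 1 at the extremes of k-bit p and q."""
    smallest = 1 << (k - 1)     # smallest k-bit integer
    largest = (1 << k) - 1      # largest k-bit integer
    low = (2 * smallest * smallest + 1).bit_length()
    high = (2 * largest * largest + 1).bit_length()
    return low, high


low, high = bit_length_bounds(512)
print(low, high)
```

The modulus therefore never exceeds 1025 bits, so 1025 bits is a safe upper bound on the size of any element of Z*_P.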




(5) Privacy-Preserving Time-Series Data Aggregation for Internet of Things

Abstract

In recent years, networking and collaboration among various devices have grown
tremendously. In response to this trend, the concept of the Internet of Things (IoT)
has received great attention from both academia and industry. Because it can support
a large number of ubiquitous characteristics and achieve better cost efficiency, IoT
has many real-world applications, including traffic surveillance, smart metering,
environmental monitoring, industrial automation and military scenarios. Despite all
this attention, many security and privacy challenges remain in IoT. Since most IoT
devices are deployed in unattended areas, they are vulnerable to physical attacks that
may not be detected immediately, and the broadcast nature of wireless communication
makes it easy for an attacker to launch eavesdropping attacks. As many research
efforts have already been devoted to IoT security challenges, in this chapter we focus
mainly on the privacy challenges in IoT. To address these challenges, i.e., to protect
the data privacy of individual devices in IoT, many privacy-preserving data
aggregation schemes have been proposed. However, most of them support only
one-dimensional data aggregation, which sometimes cannot meet the accuracy
requirement in IoT scenarios. Although our previous work EPPA deals with
multi-dimensional data aggregation [7], it may not support large-space data
aggregation well. Therefore, to address the above challenges, we propose a novel
privacy-preserving time-series aggregation scheme for IoT, which exploits the
properties of the group Z*_{p^2} to support data aggregation for both a small
plaintext space and a large plaintext space at the same time, and is thus more
efficient than traditional data aggregation.


Trusted  Authority  Control  Center 


Major  Features  and  Contributions 

• We propose a novel privacy-preserving time-series aggregation scheme based on the
group Z*_{p^2}. The proposed scheme can use one single aggregated data item to obtain
both the small-plaintext-space aggregation and the large-plaintext-space aggregation
in a privacy-preserving way at the same time.

• With a formal security proof technique, we show that the proposed scheme preserves
the data privacy of each individual node.

• We implement the proposed scheme in Java and run extensive experiments to validate
its efficiency in terms of low computational cost and communication overhead, and we
discuss the trade-off between utility and the differential privacy level.

Performance Evaluation

We evaluate the proposed privacy-preserving time-series aggregation scheme in terms
of computational cost and communication overhead, and we analyze the utility of
differential privacy as well.

The parameter settings are listed as follows.


Parameter | Value
λ         | λ = 1024
Z*_{p^2}  | Z*_{p^2} is a group of order φ(p^2) = p(p - 1), where |p| = λ
n_max     | n_max = 1000
n         | n = 200, 400, 600, 800, 1000
Δ         | [value garbled in the source]
ε         | differential privacy level ε = 1, 2, 3

Parameter settings


A.  Computation  Complexity 


(a)  Aggregation  at  the  gateway  (b)  Decryption  at  the  control  center 

Computational  costs  of  aggregation  and  decryption  varying  with  n 

B.  Communication  Cost 

When |p| = 1024, any ciphertext (including q and C) in the group Z*_{p^2} is at most
2048 bits long.

C.  Utility  of  Differential  Privacy 

We take the smart grid as an example to demonstrate the advantages and effectiveness
of the proposed scheme. Unlike previously reported aggregation schemes for the smart
grid, our scheme supports aggregation of user measurements that include not only the
integer part (small plaintext data x_i ∈ [0, 30]) but also the decimal part (large
plaintext data m_i ∈ [0, 999]). The detailed parameter settings are listed as follows.


Description                               | Parameter | Value
Number of users                           | n         | 10000
User measurement                          | x_i.m_i   | {0.000, 0.001, 0.002, ..., 29.999, 30.000}
Differential privacy level                | ε         | 1, 2, 3
Sensitivity of small plaintext space data | Δ         | 30
Sensitivity of large plaintext space data | Δ'        | 999

Parameter settings for the evaluation of utility of differential privacy
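
How a measurement of the form x_i.m_i splits into the two plaintext spaces can be shown with integer arithmetic. In the sketch below, readings are expressed in thousandths so that no floating-point rounding occurs. The function name and the three sample readings are hypothetical, and the two sums are computed in the clear rather than under encryption.

```python
def split_measurement(milli_units):
    """Split a reading given in thousandths into (integer part, decimal part)."""
    return divmod(milli_units, 1000)


# Hypothetical readings 29.999, 0.001 and 12.500, stored as thousandths.
readings = [29_999, 1, 12_500]
parts = [split_measurement(r) for r in readings]

small_sum = sum(x for x, _ in parts)   # aggregate over the small plaintext space
large_sum = sum(m for _, m in parts)   # aggregate over the large plaintext space
total_milli = small_sum * 1000 + large_sum
print(parts, small_sum, large_sum, total_milli)
```

Recombining the two aggregates gives 42.500, the same total as summing the original readings, so nothing is lost by aggregating the two parts separately.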


Based on real data for 10000 households, we plot the traces of the actual total
measurements and the noisy total consumption for the small plaintext space and the
large plaintext space, respectively. We also set ε, the differential privacy level, to
1, 2, and 3 for each of the two scenarios. As can be seen from the following figures,
the larger ε is, the less noise is added and the higher the utility; the smaller ε is,
the more noise is included and the higher the level of privacy that can be guaranteed.
Compared with the case of ε = 3, the utility for ε = 1 is lower, but it is still
acceptable. Therefore, in real scenarios, there is a trade-off between privacy and
utility.


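[Figure: actual and noisy totals over time, with one panel per value of ε, starting
with (a) ε = 1]

Differential privacy for small-plaintext-space data aggregation

[Figure: actual and noisy totals over time, with one panel per value of ε]

Differential privacy for large-plaintext-space data aggregation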
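
The trade-off between ε and the amount of noise can be made concrete under one assumption that this section does not confirm: that the noise follows the standard Laplace mechanism, whose scale is the sensitivity divided by ε. The sketch below tabulates that scale for the sensitivities and privacy levels listed in the parameter settings; the function name `laplace_scale` is hypothetical.

```python
def laplace_scale(sensitivity, epsilon):
    """Scale b of Laplace noise under the standard Laplace mechanism."""
    return sensitivity / epsilon


scales = {
    (sensitivity, epsilon): laplace_scale(sensitivity, epsilon)
    for sensitivity in (30, 999)   # small and large plaintext spaces
    for epsilon in (1, 2, 3)
}

for (sensitivity, epsilon), b in sorted(scales.items()):
    print(f"sensitivity={sensitivity:4d} epsilon={epsilon} scale={b:.1f}")
```

Under this assumption, moving from ε = 1 to ε = 3 cuts the typical noise magnitude to one third, which matches the qualitative statement that larger ε gives higher utility and weaker privacy.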

(6) A Lightweight Privacy-Preserving Scheme with Data Integrity for Smart Grid
Communications

Abstract 

The smart grid, regarded as the next generation of the power grid, can efficiently
monitor, control, and predict energy generation and consumption. However, the frequent
collection of users' consumption information in the smart grid may reveal users'
private information, and tampering with smart grid communication may impair data
integrity, subsequently affecting precise monitoring and control at the control
center. In this paper, to address these challenges, we propose a lightweight data
report scheme for smart grid communications that achieves privacy preservation and
data integrity simultaneously. Specifically, an efficient pseudonym identity-based
privacy-preserving report approach is proposed, which allows the control center to
obtain the fine-grained usage data of all users while protecting users' privacy. An
online/off-line hash tree-based mechanism is also designed to check and assure the
data integrity of communications. Because most time-consuming computations are shifted
to the off-line phase, the online process is very fast and efficient: it performs only
lightweight bottom-up hash tree verifications to check the data integrity of all users
concurrently. Furthermore, a topology-independent data report architecture is also
constructed, which allows dynamic residential users to form clusters spontaneously and
report data efficiently in groups. Extensive performance evaluation demonstrates that
the proposed scheme achieves lower communication overhead and dramatically reduces the
computational cost in comparison with existing schemes.

Control  Center  (CC)  Gateway  (GW)  Trusted  Authority  (TA) 


System  model  under  consideration 


Major  Features  and  Contributions 

• A lightweight pseudonym identity-based privacy-preserving data report approach is
proposed. Unlike existing data aggregation schemes, in which the control center (CC) can
obtain only the sum of the usage data, the CC here obtains the fine-grained usage data of
all users in a privacy-preserving way. With this detailed information, and without
revealing users' privacy, the CC can monitor and control the whole system more efficiently.

• An online/off-line hash tree-based authentication and data integrity verification
mechanism is designed. Most of the computations of the resource-limited smart meter can be
pre-processed in the off-line phase. Furthermore, the source authentication and data
integrity of all received usage reports can be checked simultaneously by performing the
bottom-up hash tree verification procedure.

• A distributed and autonomous data collection architecture is constructed. Users in
neighboring areas can form clusters dynamically and flexibly, which makes the data report
topology independent. Extensive performance evaluation demonstrates that the proposed
architecture incurs less communication overhead and dramatically lower computation cost
than existing similar schemes.
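
The bottom-up hash tree check mentioned in the list above can be illustrated with a generic Merkle tree. This is a minimal sketch under stated assumptions, not the construction from the paper: SHA-256, the pairwise folding of leaves, the duplication of an odd trailing node, the sample reports, and the function names are all chosen here for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256 digest (the hash choice is an assumption)."""
    return hashlib.sha256(data).digest()

def merkle_root(reports: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise, bottom-up, until a single root remains."""
    if not reports:
        raise ValueError("at least one report is required")
    level = [h(r) for r in reports]
    while len(level) > 1:
        if len(level) % 2 == 1:          # odd count: duplicate the last node
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Reference root, assumed to have been prepared and authenticated ahead of time.
reports = [b"meter-1:3.2kWh", b"meter-2:1.7kWh", b"meter-3:4.0kWh"]
expected_root = merkle_root(reports)

# A single bottom-up recomputation checks every received report at once.
received_ok = list(reports)
received_bad = [b"meter-1:3.2kWh", b"meter-2:9.9kWh", b"meter-3:4.0kWh"]
print(merkle_root(received_ok) == expected_root)   # True
print(merkle_root(received_bad) == expected_root)  # False
```

Changing any single report changes the recomputed root, which is why one comparison at the root suffices to detect tampering in any leaf.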

Performance  Evaluation 

The proposed scheme achieves privacy preservation and data integrity simultaneously
for secure data report with a flexible topology of users in the RA for smart grid
communications. We mainly compare the performance of the proposed scheme with that of
comparable state-of-the-art schemes [5, 6].

A.  Computation  Cost 

The feature comparison, the time cost of operations, the computation cost comparison, and
the performance comparison of computation cost are presented below.


Feature   Proposed scheme   Scheme of Fan et al. [6]   Scheme of Fouda et al. [5]
D         Yes               Yes                        Yes
P         Yes               Yes                        No
F         Yes               No                         Partial*

D: data integrity; P: privacy preservation; F: support for data report with flexible topology.
* Because only a generic, simplex peer-to-peer communication architecture is considered, the
scheme cannot be regarded as having achieved a fully flexible topology.

Feature comparison

Notation   Description                               Time cost
C_m        Multiplication                            ≈ 0.15 ms
C_e        Exponentiation                            ≈ 1.6 ms
C_a        Addition                                  ≈ 0.005 ms
C_AESe     AES encryption                            ≈ 75 MiB/second
C_AESd     AES decryption                            ≈ 75 MiB/second
C_PKE      Public key encryption                     ≈ 0.09 ms
C_PKD      Public key decryption                     ≈ 2.28 ms
C_p        Pairing                                   ≈ 1 9 ms
C_h        Hash                                      ≈ 0.0038 ms
C_HM       HMAC                                      ≈ 138 MiB/second
C_HMv      HMAC verification                         ≈ 138 MiB/second
C_2DNF     2-DNF formulas cryptosystem decryption    ≈ 1.06 ms

Time cost of operations

Proposed scheme
  Cluster Member (CM): C_m + C_h + C_a + C_AESe
  Cluster Head (CH):   w * (C_AESd + C_m + 2C_e + 2C_h) + (w - 1) * C_h

Fouda et al.'s scheme [21]
  Cluster Member (CM): 2C_e + C_PKE + C_PKD + C_h + C_HM + C_AESe
  Cluster Head (CH):   w * (2C_e + C_PKE + C_PKD + C_h + C_AESd + C_HMv)

Fan et al.'s scheme [33]
  Cluster Member (CM): 3C_e + 2C_m + C_h + C_e + C_y
  Cluster Head (CH):   (3w - 2)C_m + (w + 1)C_p + (2w + 2)C_e + (w + 1)C_h + C_2DNF

Computation cost comparisons
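
As a worked illustration, the per-operation timings and the cost expressions above can be combined numerically. The sketch below is hedged in several ways: the 32-byte payload used to convert the AES and HMAC throughput figures into times is an assumption (the report does not state a message size), the trailing (w - 1) term of the proposed scheme is read as hash operations, and Fan et al.'s scheme is left out because some of its extracted terms are ambiguous. All function names are hypothetical.

```python
# Per-operation timings in milliseconds, taken from the time-cost table.
C_M, C_E, C_A = 0.15, 1.6, 0.005
C_PKE, C_PKD, C_H = 0.09, 2.28, 0.0038
AES_MIB_PER_S, HMAC_MIB_PER_S = 75.0, 138.0

PAYLOAD_BYTES = 32  # assumed report size; not stated in the source

def throughput_ms(n_bytes: int, mib_per_s: float) -> float:
    """Time in ms to process n_bytes at the given MiB/s throughput."""
    return n_bytes / (mib_per_s * 1024 * 1024) * 1000.0

C_AES = throughput_ms(PAYLOAD_BYTES, AES_MIB_PER_S)    # encryption and decryption
C_HMAC = throughput_ms(PAYLOAD_BYTES, HMAC_MIB_PER_S)  # HMAC and its verification

def proposed_cm() -> float:
    return C_M + C_H + C_A + C_AES

def proposed_ch(w: int) -> float:
    # The trailing (w - 1) term is read as hash operations (an assumption).
    return w * (C_AES + C_M + 2 * C_E + 2 * C_H) + (w - 1) * C_H

def fouda_cm() -> float:
    return 2 * C_E + C_PKE + C_PKD + C_H + C_HMAC + C_AES

def fouda_ch(w: int) -> float:
    return w * (2 * C_E + C_PKE + C_PKD + C_H + C_AES + C_HMAC)

for w in (10, 50, 100):
    print(f"w={w:3d}  proposed CH: {proposed_ch(w):8.2f} ms  "
          f"Fouda CH: {fouda_ch(w):8.2f} ms")
print(f"CM side  proposed: {proposed_cm():.4f} ms  Fouda: {fouda_cm():.4f} ms")
```

Under these assumptions the symmetric-key and hash terms are negligible, so the comparison is dominated by the exponentiations and public-key operations in each expression.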

[Figure: Performance comparison of computation cost. (a) At the cluster member (CM) side. (b) At the cluster head (aggregator) side.]


B.  Communication  Overhead 


[Figure: Performance comparison of communication overhead]


Performance evaluations show that our proposed scheme is efficient in terms of
computation and communication cost, which makes it suitable for real-time, high-frequency
data reporting in smart grid communications.

(7) A Lightweight Data Aggregation Scheme Achieving Privacy Preservation and Data
Integrity with Differential Privacy and Fault Tolerance

Abstract 

Designing an efficient and secure data aggregation scheme that fits real applications has
long been pursued by the research community. In this paper, we propose a novel secure data
aggregation scheme that simultaneously achieves privacy preservation and data integrity
with differential privacy and fault tolerance. Specifically, by introducing auxiliary
ciphertexts, we put forward a novel distributed solution for fault-tolerant data
aggregation that can aggregate the measurements of the functioning smart meters flexibly
and efficiently for any reasonable number of malfunctioning smart meters with arbitrarily
long failure periods. The proposed scheme also achieves a good trade-off between the
accuracy and the security of differential privacy for an arbitrary number of
malfunctioning smart meters. In addition, a novel and efficient authentication mechanism
is proposed to generate and share session keys in a non-interactive way; these keys are
used for AES encryption to achieve source authentication and data integrity of the
transmitted data. Furthermore, by decentralizing the computational overhead and the
authority of the hub-like gateway, the security of the proposed scheme is enhanced and its
efficiency is improved significantly. Finally, extensive performance evaluations
illustrate that the proposed data aggregation scheme outperforms comparable
state-of-the-art schemes in terms of computation complexity, communication cost,
robustness of fault tolerance, and utility of differential privacy.

Major Features and Contributions

• By introducing auxiliary ciphertexts, we put forward a novel distributed solution for
fault-tolerant data aggregation. Most existing similar works depend on a central trusted
authority to trace the malfunctioning smart meters and separate them from the functioning
ones so that the measurements can still be aggregated when reports fail. In contrast, our
proposed scheme tolerates malfunctioning smart meters without the participation of, or
restriction by, any external party. Specifically, using the auxiliary ciphertexts, the CC
can obtain the aggregate of the functioning smart meters flexibly and efficiently for any
reasonable number of malfunctioning smart meters with arbitrarily long failures.

• Because users' private data may often suffer from differential attacks, our proposed
scheme provides differential privacy: the GW adds appropriate noise, drawn from the
symmetric geometric distribution, to the aggregated data. To the best of our knowledge,
most existing similar works cannot support differential privacy and fault tolerance at the
same time. The handful of works that try to address this problem consider only scenarios
with a small number (or a fixed maximum number) of malfunctioning smart meters, so that
appropriate noise can be added to support differential privacy. Our scheme supports
differential privacy and fault tolerance simultaneously, and achieves a good trade-off
between the accuracy and the security of differential privacy for an arbitrary number of
malfunctioning smart meters.

• By combining the identities and private/public keys of the two communicating parties
with the current time slot for data report, a novel and efficient authentication technique
is proposed to generate and share session keys flexibly in a non-interactive way. The
shared session key is used for AES encryption to achieve source authentication and data
integrity of the transmitted data. The security analysis and performance evaluation
indicate that the proposed mechanism can efficiently and effectively prevent a malicious
adversary from impairing or polluting the transmitted data (e.g., by modifying, forging,
injecting, replaying, and/or delaying it).

• By decentralizing the computational overhead and the power of the hub-like entity GW,
which usually has limited computation resources and is only semi-trusted, the security of
our proposed scheme is enhanced and its efficiency is improved significantly.
Specifically, the encrypted usage data and the auxiliary ciphertexts can be reported to GW
only after each has been aggregated and processed beforehand by at least two users. In
addition, through comparative performance analysis, we demonstrate that our proposed data
aggregation scheme outperforms the comparable state-of-the-art scheme [3] in terms of
computation complexity, communication cost, robustness of fault tolerance, and utility of
differential privacy.
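
One common way to obtain a shared key without exchanging messages is a static Diffie-Hellman value bound to both identities and the time slot. The sketch below illustrates that general idea only. The scheme's actual key derivation is not reproduced in this report, so the toy group parameters, the hash-based derivation, the identity strings, and all function names are assumptions.

```python
import hashlib
import secrets

# Toy group parameters for illustration only; NOT secure for real use.
P = 2**127 - 1   # a Mersenne prime
G = 3

def keygen() -> tuple[int, int]:
    """Return a (private, public) key pair in the toy group."""
    sk = secrets.randbelow(P - 3) + 2
    return sk, pow(G, sk, P)

def session_key(my_sk: int, peer_pk: int, id_a: str, id_b: str, slot: int) -> bytes:
    """Derive a 256-bit key from the static DH value, both identities, and the time slot."""
    shared = pow(peer_pk, my_sk, P)
    ids = "|".join(sorted((id_a, id_b)))          # order-independent identity pair
    material = f"{ids}|{slot}|{shared}".encode()
    return hashlib.sha256(material).digest()      # length suits an AES-256 key

user_sk, user_pk = keygen()
gw_sk, gw_pk = keygen()

# Each side uses only its own private key and the peer's public key.
k_user = session_key(user_sk, gw_pk, "user-17", "GW", slot=42)
k_gw = session_key(gw_sk, user_pk, "GW", "user-17", slot=42)
print(k_user == k_gw)                                                   # True
print(k_user == session_key(user_sk, gw_pk, "user-17", "GW", slot=43))  # False
```

Binding the time slot into the derivation gives a fresh key for each reporting period, so a ciphertext replayed in a later slot would fail to decrypt under that slot's key.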

Performance  Evaluation 

The  proposed  scheme  achieves  privacy  preservation  and  data  integrity  simultaneously 
for  secure  data  aggregation  with  differential  privacy  and  fault  tolerance  for  smart  grid 
communications.  We  mainly  compare  the  performance  of  our  proposed  scheme  with 
the  state-of-the-art  similar  schemes  [4,  5,  6]. 

A.  Computation  Complexity 

The feature comparison, the computation cost comparison, and the performance comparison
of computation cost are as follows.

[Table: Computation cost comparisons]

[Figure: Performance comparison of computation cost. (a) At the user side. (b) At the aggregator side.]


B.  Communication  Cost 


[Figure: Performance comparison of communication overhead]


C.  Utility  of  Differential  Privacy 

Based on real data from 2000 households, we compare the utility of differential privacy
of our proposed scheme with that of the state-of-the-art scheme [3]. The following figures
show the traces of the actual total measurements and the noisy counterparts produced by
[3] and by our proposed scheme for different parameters, where n denotes the total number
of households and p denotes the ratio of malfunctioning smart meters. As can be seen from
the figures, the larger p is, the more accurate our scheme is compared with the scheme
of [3].
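
To make the notion of a noisy total concrete, the following sketch adds symmetric geometric noise to an aggregate. It is a generic illustration and not the paper's mechanism: the noise is added to a plaintext sum rather than to an encrypted aggregate, the parameterization alpha = exp(-epsilon / sensitivity) is the standard geometric-mechanism choice assumed here, the readings are synthetic, and the function names are hypothetical.

```python
import math
import random

def symmetric_geometric(alpha: float, rng: random.Random) -> int:
    """Sample X with Pr[X = k] = (1 - alpha) / (1 + alpha) * alpha**|k|, 0 < alpha < 1."""
    def geometric() -> int:
        u = 1.0 - rng.random()                    # uniform on (0, 1]
        return int(math.floor(math.log(u) / math.log(alpha)))
    return geometric() - geometric()              # difference of two i.i.d. geometrics

def noisy_total(readings: list[int], epsilon: float, sensitivity: int,
                rng: random.Random) -> int:
    """Add symmetric geometric noise with alpha = exp(-epsilon / sensitivity)."""
    alpha = math.exp(-epsilon / sensitivity)
    return sum(readings) + symmetric_geometric(alpha, rng)

rng = random.Random(0)
readings = [rng.randint(0, 5) for _ in range(2000)]   # synthetic, not the paper's data
for eps in (0.5, 1.0, 2.0):
    print(eps, sum(readings), noisy_total(readings, eps, sensitivity=5, rng=rng))
```

A smaller epsilon gives an alpha closer to 1 and therefore wider noise, which is the accuracy versus privacy trade-off that the comparison examines.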

[Figure: Comparison of noisy total consumption between the proposed aggregation protocol
and Jongho et al.'s protocol [3], with n = 2000 in all panels. (a) p = 0.05, ε = 0.5;
(b) p = 0.15, ε = 0.5; (c) p = 0.25, ε = 0.5; (d) p = 0.05, ε = 1; (e) p = 0.15, ε = 1;
(f) p = 0.25, ε = 1; (g) p = 0.05, ε = 2; (h) p = 0.15, ε = 2; (i) p = 0.25, ε = 2.]


Let the 1-h root-mean-square errors (RMSE) of Jongho et al.'s protocol [3] and our
proposed scheme be y_1 and y_2, respectively. The ratio y = y_1 / y_2 as a function of p
under different privacy levels ε is depicted in the following figure, which shows that,
compared with [3], our proposed scheme always achieves better utility owing to much lower
errors in every circumstance.


[Figure: Comparison of 1-h RMSE between the proposed aggregation protocol and Jongho et
al.'s protocol [3]]
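
The RMSE ratio used in this comparison can be computed as in the following sketch. The sample traces are made-up numbers chosen only to exercise the formula, and the function names are hypothetical; a ratio above 1 means the baseline has the larger error.

```python
import math

def rmse(actual: list[float], noisy: list[float]) -> float:
    """Root-mean-square error between an actual trace and its noisy counterpart."""
    if len(actual) != len(noisy) or not actual:
        raise ValueError("traces must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(actual, noisy)) / len(actual))

def rmse_ratio(actual: list[float], noisy_baseline: list[float],
               noisy_proposed: list[float]) -> float:
    """Ratio of the baseline RMSE to the proposed scheme's RMSE."""
    return rmse(actual, noisy_baseline) / rmse(actual, noisy_proposed)

# Illustrative hourly totals (made-up values, not measurements from the report).
actual = [100.0, 120.0, 90.0, 110.0]
baseline = [106.0, 112.0, 96.0, 102.0]
proposed = [103.0, 116.0, 93.0, 106.0]
print(round(rmse_ratio(actual, baseline, proposed), 6))  # 2.0
```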

The above comparison shows that our proposed data aggregation scheme provides higher
utility (i.e., lower error) in terms of differential privacy than the state-of-the-art
data aggregation scheme of [3].